This paper deals with chain graphs under the alternative Andersson-Madigan-Perlman 
(AMP) interpretation. In particular, we present a constraint based algorithm for learning 
an AMP chain graph a given probability distribution is faithful to. We also show that the 
extension of Meek's conjecture to AMP chain graphs does not hold, which compromises the 
development of efficient and correct score+search learning algorithms under assumptions 
weaker than faithfulness. 



1 Introduction 

This paper deals with chain graphs (CGs) under the alternative Andersson-Madigan-Perlman (AMP) interpretation (Andersson et al., 2001). In particular, we present an algorithm for learning an AMP CG a given probability distribution is faithful to. To our knowledge, we are the first to present such an algorithm. However, it is worth mentioning that, under the classical Lauritzen-Wermuth-Frydenberg (LWF) interpretation of CGs (Lauritzen, 1996), such an algorithm already exists (Ma et al., 2008; Studený, 1997). Moreover, we have recently developed an algorithm for learning LWF CGs under the milder composition property assumption (Peña et al., 2012).

The AMP and LWF interpretations of CGs are sometimes considered as competing and, thus, their relative merits have been pointed out (Andersson et al., 2001; Drton and Eichler, 2006; Levitz et al., 2001; Roverato and Studený, 2006). Note, however, that no interpretation subsumes the other: There are many independence models that can be induced by a CG under one interpretation but that cannot be induced by any CG under the other interpretation (Andersson et al., 2001, Theorem 6).



The rest of the paper is organized as follows. Section 2 reviews some concepts. Section 3 presents the algorithm. Section 4 proves its correctness. Section 5 closes with some discussion.

2 Preliminaries 

In this section, we review some concepts from 
probabilistic graphical models that are used 
later in this paper. All the graphs and probabil- 
ity distributions in this paper are defined over 
a finite set V. All the graphs in this paper are 
hybrid graphs, i.e. they have (possibly) both 
directed and undirected edges. The elements 
of V are not distinguished from singletons. We 
denote by $|X|$ the cardinality of $X \subseteq V$. 

If a graph $G$ contains an undirected (resp. directed) edge between two nodes $V_1$ and $V_2$, then we write that $V_1 - V_2$ (resp. $V_1 \to V_2$) is in $G$. The parents of a set of nodes $X$ of $G$ is the set $pa_G(X) = \{V_1 \mid V_1 \to V_2$ is in $G$, $V_1 \notin X$ and $V_2 \in X\}$. The neighbors of a set of nodes $X$ of $G$ is the set $ne_G(X) = \{V_1 \mid V_1 - V_2$ is in $G$, $V_1 \notin X$ and $V_2 \in X\}$. The adjacents of a set of nodes $X$ of $G$ is the set $ad_G(X) = \{V_1 \mid V_1 \to V_2$, $V_1 - V_2$ or $V_1 \leftarrow V_2$ is in $G$, $V_1 \notin X$ and $V_2 \in X\}$. A route from a node $V_1$ to a node $V_n$ in $G$ is a sequence of (not necessarily distinct) nodes $V_1, \ldots, V_n$ such that $V_i \in ad_G(V_{i+1})$ for all $1 \le i < n$. The length of a route is the number of (not necessarily distinct) edges in the route, e.g. the length of the route $V_1, \ldots, V_n$ is $n - 1$. A route is called a cycle if $V_n = V_1$. A route is called descending if $V_i \in pa_G(V_{i+1}) \cup ne_G(V_{i+1})$ for all $1 \le i < n$. The descendants of a set of nodes $X$ of $G$ is the set $de_G(X) = \{V_n \mid$ there is a descending route from $V_1$ to $V_n$ in $G$, $V_1 \in X$ and $V_n \notin X\}$. A cycle is called a semidirected cycle if it is descending and $V_i \to V_{i+1}$ is in $G$ for some $1 \le i < n$. A chain graph (CG) is a hybrid graph with no semidirected cycles. A set of nodes of a CG is connected if there exists a route in the CG between every pair of nodes in the set st all the edges in the route are undirected. A connectivity component of a CG is a connected set that is maximal wrt set inclusion. The connectivity component a node $A$ of a CG $G$ belongs to is denoted as $co_G(A)$. The subgraph of $G$ induced by a set of its nodes $X$ is the graph over $X$ that has all and only the edges in $G$ whose both ends are in $X$. An immorality is an induced subgraph of the form $A \to B \leftarrow C$. A flag is an induced subgraph of the form $A \to B - C$. If $G$ has an induced subgraph of the form $A \to B \leftarrow C$ or $A \to B - C$, then we say that the triplex $(\{A, C\}, B)$ is in $G$. Two CGs are triplex equivalent iff they have the same adjacencies and the same triplexes.
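
The definitions above can be made concrete with a small sketch. The class name `HybridGraph`, its constructor arguments and the helper `triplex_equivalent` are hypothetical names introduced only for illustration; they are not part of the paper. The sketch assumes at most one edge between any two nodes and node labels that can be sorted.

```python
from itertools import combinations


class HybridGraph:
    """Illustrative hybrid graph: directed edges are (a, b) pairs meaning a -> b,
    undirected edges are two-element frozensets meaning a - b."""

    def __init__(self, nodes, directed=(), undirected=()):
        self.nodes = set(nodes)
        self.directed = set(directed)
        self.undirected = {frozenset(e) for e in undirected}

    def pa(self, xs):
        """Parents of the node set xs: nodes outside xs with an arrow into xs."""
        xs = set(xs)
        return {a for (a, b) in self.directed if b in xs and a not in xs}

    def ne(self, xs):
        """Neighbors of xs: nodes outside xs joined to xs by an undirected edge."""
        xs = set(xs)
        out = set()
        for e in self.undirected:
            a, b = tuple(e)
            if a in xs and b not in xs:
                out.add(b)
            if b in xs and a not in xs:
                out.add(a)
        return out

    def ad(self, xs):
        """Adjacents of xs: parents, neighbors and children lying outside xs."""
        xs = set(xs)
        children = {b for (a, b) in self.directed if a in xs and b not in xs}
        return self.pa(xs) | self.ne(xs) | children

    def adjacent(self, a, b):
        return b in self.ad({a})

    def triplexes(self):
        """All triplexes ({A, C}, B): A and C nonadjacent, and the induced
        subgraph is A -> B <- C, A -> B - C or A - B <- C."""
        found = set()
        for b in self.nodes:
            for a, c in combinations(sorted(self.ad({b})), 2):
                if self.adjacent(a, c):
                    continue
                a_in = (a, b) in self.directed
                c_in = (c, b) in self.directed
                a_und = frozenset((a, b)) in self.undirected
                c_und = frozenset((c, b)) in self.undirected
                if (a_in and (c_in or c_und)) or (c_in and a_und):
                    found.add((frozenset((a, c)), b))
        return found


def triplex_equivalent(g, h):
    """Same node set, same adjacencies and same triplexes."""
    if g.nodes != h.nodes:
        return False
    same_adjacencies = all(
        g.adjacent(a, b) == h.adjacent(a, b)
        for a, b in combinations(sorted(g.nodes), 2)
    )
    return same_adjacencies and g.triplexes() == h.triplexes()
```

For instance, under this sketch the flag $A \to B - C$ and the immorality $A \to B \leftarrow C$ come out as triplex equivalent, whereas the undirected chain $A - B - C$ is not triplex equivalent to either.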

A node $B$ in a route $\rho$ is called a head-no-tail node in $\rho$ if $A \to B \leftarrow C$, $A \to B - C$, or $A - B \leftarrow C$ is a subroute of $\rho$ (note that maybe $A = C$ in the first case). Let $X$, $Y$ and $Z$ denote three disjoint subsets of $V$. A route $\rho$ in a CG $G$ is said to be $Z$-open when (i) every head-no-tail node in $\rho$ is in $Z$, and (ii) every other node in $\rho$ is not in $Z$. When there is no route in $G$ between a node in $X$ and a node in $Y$ that is $Z$-open, we say that $X$ is separated from $Y$ given $Z$ in $G$ and denote it as $X \perp_G Y \mid Z$. We denote by $X \not\perp_G Y \mid Z$ that $X \perp_G Y \mid Z$ does not hold. Likewise, we denote by $X \perp_p Y \mid Z$ (resp. $X \not\perp_p Y \mid Z$) that $X$ is independent (resp. dependent) of $Y$ given $Z$ in a probability distribution $p$. The independence model induced by $G$, denoted as $I(G)$, is the set of separation statements $X \perp_G Y \mid Z$. We say that $p$ is Markovian with respect to $G$ when $X \perp_p Y \mid Z$ if $X \perp_G Y \mid Z$ for all $X$, $Y$ and $Z$ disjoint subsets of $V$. We say that $p$ is faithful to $G$ when $X \perp_p Y \mid Z$ iff $X \perp_G Y \mid Z$ for all $X$, $Y$ and $Z$ disjoint subsets of $V$. If two CGs $G$ and $H$ are triplex equivalent, then $I(G) = I(H)$.¹



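Let $X$, $Y$, $Z$ and $W$ denote four disjoint subsets of $V$. Any probability distribution $p$ satisfies the following properties: Symmetry $X \perp_p Y \mid Z \Rightarrow Y \perp_p X \mid Z$, decomposition $X \perp_p Y \cup W \mid Z \Rightarrow X \perp_p Y \mid Z$, weak union $X \perp_p Y \cup W \mid Z \Rightarrow X \perp_p Y \mid Z \cup W$, and contraction $X \perp_p Y \mid Z \cup W \wedge X \perp_p W \mid Z \Rightarrow X \perp_p Y \cup W \mid Z$. Moreover, if $p$ is faithful to a CG, then it also satisfies the following properties: Intersection $X \perp_p Y \mid Z \cup W \wedge X \perp_p W \mid Z \cup Y \Rightarrow X \perp_p Y \cup W \mid Z$, and composition $X \perp_p Y \mid Z \wedge X \perp_p W \mid Z \Rightarrow X \perp_p Y \cup W \mid Z$.²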

3 The Algorithm 

Our algorithm, which can be seen in Table 1, resembles the well-known PC algorithm (Meek, 1995; Spirtes et al., 1993). It consists of two phases: The first phase (lines 1-9) aims at learning adjacencies, whereas the second phase (lines 10-11) aims at directing some of the adjacencies learnt. Specifically, the first phase declares that two nodes are adjacent iff they are not separated by any set of nodes. Note that the algorithm does not test every possible separator (see line 6). Note also that the separators tested are tested in increasing order of size (see lines 2, 6 and 9). The second phase consists of two steps. In the first step, the ends of some of the edges learnt in the first phase are blocked according to the rules R1-R4 in Table 2. A block is represented by a perpendicular line such as in $\vdash$ or $\vdash\!\dashv$, and it means that the edge cannot be directed in that direction. In the second step, the edges with exactly one unblocked end get directed in the direction of the unblocked end. The rules R1-R4 work as follows: If the conditions in the antecedent of a rule are satisfied, then the modifications in the consequent of the rule are applied. Note that the ends of some of the edges in the rules are labeled with a circle such as in $\vdash\!\!\circ$ or $\circ\!\!-\!\!\circ$. The circle represents an unspecified end, i.e. a block or nothing. The modifications in the consequents of the rules consist in adding some blocks. Note that only the blocks that appear in the consequents are added, i.e. the circled ends do not get modified. The conditions in the antecedents of R1, R2 and R4 consist of an induced subgraph of $H$ and the fact that some of its nodes are or are not in some separators found in line 7. The condition in the antecedent of R3 is slightly different as it only says that there is a cycle in $H$ whose edges have certain blocks, i.e. it says nothing about the subgraph induced by the nodes in the cycle or whether these nodes belong to some separators or not. Note that, when considering the application of R3, one does not need to consider intersecting cycles, i.e. cycles containing repeated nodes other than the initial and final ones.

¹ To see it, note that there are Gaussian distributions $p$ and $q$ that are faithful to $G$ and $H$, respectively (Levitz et al., 2001, Theorem 6.1). Moreover, $p$ and $q$ are Markovian wrt $H$ and $G$, respectively, by Andersson et al. (2001, Theorem 5) and Levitz et al. (2001, Theorem 4.1).

² To see it, note that there is a Gaussian distribution that is faithful to $G$ (Levitz et al., 2001, Theorem 6.1). Moreover, every Gaussian distribution satisfies the intersection and composition properties (Studený, 2005, Proposition 2.1 and Corollary 2.4).

Table 1: The algorithm.

Input: A probability distribution $p$ that is faithful to an unknown CG $G$.
Output: A CG $H$ that is triplex equivalent to $G$.

1   Let $H$ denote the complete undirected graph
2   Set $l = 0$
3   Repeat while possible
4       Repeat while possible
5           Select any ordered pair of nodes $A$ and $B$ in $H$ st $A \in ad_H(B)$ and $|[ad_H(A) \cup ad_H(ad_H(A))] \setminus B| \ge l$
6           If there exists some $S \subseteq [ad_H(A) \cup ad_H(ad_H(A))] \setminus B$ st $|S| = l$ and $A \perp_p B \mid S$ then
7               Set $S_{AB} = S_{BA} = S$
8               Remove the edge $A - B$ from $H$
9       Set $l = l + 1$
10  Apply the rules R1-R4 to $H$ while possible
11  Replace every edge $\vdash$ ($\vdash\!\dashv$) in $H$ with $\to$ ($-$)

Table 2: The rules R1-R4.

R1: $A \circ\!\!-\!\!\circ B \circ\!\!-\!\!\circ C \;\wedge\; B \notin S_{AC} \;\Rightarrow\; A \vdash\!\!\circ B \circ\!\!\dashv C$
R2: $A \vdash\!\!\circ B \circ\!\!-\!\!\circ C \;\wedge\; B \in S_{AC} \;\Rightarrow\; A \vdash\!\!\circ B \vdash\!\!\circ C$
R3: [Diagram: a cycle in $H$ through $A$ and $B$ whose edges have certain blocks] $\Rightarrow\; A \vdash\!\!\circ B$
R4: [Diagram: an induced subgraph of $H$ over $A$, $B$, $C$ and $D$] $\wedge\; A \in S_{CD} \;\wedge\; B \notin S_{CD} \;\Rightarrow\; A \vdash\!\!\circ B$
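
The first phase (lines 1-9 of Table 1) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `candidate_pool` and `learn_adjacencies`, and the oracle `indep(a, b, s)` that answers whether $A \perp_p B \mid S$ holds, are hypothetical. One assumption is made explicit in the code: $A$ itself is dropped from the candidate separators, since a conditioning set must be disjoint from $A$.

```python
from itertools import combinations


def candidate_pool(adj, a, b):
    """[ad_H(A) u ad_H(ad_H(A))] minus B. Assumption: A is removed as well,
    because a separator of A and B cannot contain A."""
    pool = set(adj[a])
    for n in adj[a]:
        pool |= adj[n]
    return pool - {a, b}


def learn_adjacencies(nodes, indep):
    """Sketch of lines 1-9 of Table 1 with a conditional independence oracle.

    indep(a, b, s) must return True iff a and b are independent given the
    frozenset s. Returns the adjacency sets of H and the separators found."""
    nodes = sorted(nodes)
    adj = {a: set(nodes) - {a} for a in nodes}      # line 1: complete graph
    sep = {}
    size = 0                                        # line 2
    while True:                                     # line 3
        any_pair = False
        for a in nodes:                             # lines 4-5: ordered pairs
            for b in nodes:
                if b not in adj[a]:
                    continue
                pool = candidate_pool(adj, a, b)
                if len(pool) < size:
                    continue
                any_pair = True
                for s in combinations(sorted(pool), size):   # line 6
                    if indep(a, b, frozenset(s)):
                        sep[frozenset((a, b))] = frozenset(s)    # line 7
                        adj[a].discard(b)                        # line 8
                        adj[b].discard(a)
                        break
        if not any_pair:
            return adj, sep
        size += 1                                   # line 9
```

With an oracle for the undirected chain $A - B - C$, whose only independence is $A \perp_p C \mid B$, this sketch removes the edge between $A$ and $C$ at separator size one and records $S_{AC} = \{B\}$.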

4 Correctness of the Algorithm 

In this section, we prove that our algorithm is correct, i.e. it returns a CG the given probability distribution is faithful to. We start by proving a result for any probability distribution that satisfies the intersection and composition properties. Recall that any probability distribution that is faithful to a CG satisfies these properties and, thus, the following result applies to it.

Lemma 1. Let $p$ denote a probability distribution that satisfies the intersection and composition properties. Then, $p$ is Markovian wrt a CG $G$ iff $p$ satisfies the following conditions:

C1: $A \perp_p co_G(A) \setminus A \setminus ne_G(A) \mid pa_G(A \cup ne_G(A)) \cup ne_G(A)$ for all $A \in V$, and

C2: $A \perp_p V \setminus A \setminus de_G(A) \setminus pa_G(A) \mid pa_G(A)$ for all $A \in V$.



Proof. It follows from Andersson et al. (2001, Theorem 3) and Levitz et al. (2001, Theorem 4.1) that $p$ is Markovian wrt $G$ iff $p$ satisfies the following conditions:

L1: $A \perp_p co_G(A) \setminus A \setminus ne_G(A) \mid [V \setminus co_G(A) \setminus de_G(co_G(A))] \cup ne_G(A)$ for all $A \in V$, and

L2: $A \perp_p V \setminus co_G(A) \setminus de_G(co_G(A)) \setminus pa_G(A) \mid pa_G(A)$ for all $A \in V$.

Clearly, C2 holds iff L2 holds because $de_G(A) = [co_G(A) \cup de_G(co_G(A))] \setminus A$. We prove below that if L2 holds, then C1 holds iff L1 holds. We first prove the if part.

1. $B \perp_p V \setminus co_G(B) \setminus de_G(co_G(B)) \setminus pa_G(B) \mid pa_G(B)$ for all $B \in A \cup ne_G(A)$ by L2.

2. $B \perp_p V \setminus co_G(B) \setminus de_G(co_G(B)) \setminus pa_G(A \cup ne_G(A)) \mid pa_G(A \cup ne_G(A))$ for all $B \in A \cup ne_G(A)$ by weak union on 1.

3. $A \cup ne_G(A) \perp_p V \setminus co_G(A) \setminus de_G(co_G(A)) \setminus pa_G(A \cup ne_G(A)) \mid pa_G(A \cup ne_G(A))$ by repeated application of symmetry and composition on 2.

4. $A \perp_p V \setminus co_G(A) \setminus de_G(co_G(A)) \setminus pa_G(A \cup ne_G(A)) \mid pa_G(A \cup ne_G(A)) \cup ne_G(A)$ by symmetry and weak union on 3.

5. $A \perp_p co_G(A) \setminus A \setminus ne_G(A) \mid [V \setminus co_G(A) \setminus de_G(co_G(A))] \cup ne_G(A)$ by L1.

6. $A \perp_p [co_G(A) \setminus A \setminus ne_G(A)] \cup [V \setminus co_G(A) \setminus de_G(co_G(A)) \setminus pa_G(A \cup ne_G(A))] \mid pa_G(A \cup ne_G(A)) \cup ne_G(A)$ by contraction on 4 and 5.

7. $A \perp_p co_G(A) \setminus A \setminus ne_G(A) \mid pa_G(A \cup ne_G(A)) \cup ne_G(A)$ by decomposition on 6.

We now prove the only if part.

8. $A \perp_p co_G(A) \setminus A \setminus ne_G(A) \mid pa_G(A \cup ne_G(A)) \cup ne_G(A)$ by C1.

9. $A \perp_p [V \setminus co_G(A) \setminus de_G(co_G(A)) \setminus pa_G(A \cup ne_G(A))] \cup [co_G(A) \setminus A \setminus ne_G(A)] \mid pa_G(A \cup ne_G(A)) \cup ne_G(A)$ by composition on 4 and 8.

10. $A \perp_p co_G(A) \setminus A \setminus ne_G(A) \mid [V \setminus co_G(A) \setminus de_G(co_G(A))] \cup ne_G(A)$ by weak union on 9.

□

Lemma 2. After line 9, G and H have the 
same adjacencies. 

Proof. Consider any pair of nodes $A$ and $B$ in $G$. If $A \in ad_G(B)$, then $A \not\perp_p B \mid S$ for all $S \subseteq V \setminus [A \cup B]$ by the faithfulness assumption. Consequently, $A \in ad_H(B)$ at all times. On the other hand, if $A \notin ad_G(B)$, then consider the following cases.

Case 1 Assume that $co_G(A) = co_G(B)$. Then, $A \perp_p co_G(A) \setminus A \setminus ne_G(A) \mid pa_G(A \cup ne_G(A)) \cup ne_G(A)$ by C1 in Lemma 1 and, thus, $A \perp_p B \mid pa_G(A \cup ne_G(A)) \cup ne_G(A)$ by decomposition and $B \notin ne_G(A)$, which follows from $A \notin ad_G(B)$. Note that, as shown above, $pa_G(A \cup ne_G(A)) \cup ne_G(A) \subseteq [ad_H(A) \cup ad_H(ad_H(A))] \setminus B$ at all times.

Case 2 Assume that $co_G(A) \neq co_G(B)$. Then, $A \notin de_G(B)$ or $B \notin de_G(A)$ because $G$ has no semidirected cycle. Assume without loss of generality that $B \notin de_G(A)$. Then, $A \perp_p V \setminus A \setminus de_G(A) \setminus pa_G(A) \mid pa_G(A)$ by C2 in Lemma 1 and, thus, $A \perp_p B \mid pa_G(A)$ by decomposition, $B \notin de_G(A)$, and $B \notin pa_G(A)$ which follows from $A \notin ad_G(B)$. Note that, as shown above, $pa_G(A) \subseteq ad_H(A) \setminus B$ at all times.

Therefore, in either case, there will exist some $S$ in line 6 such that $A \perp_p B \mid S$ and, thus, the edge $A - B$ will be removed from $H$ in line 8. Consequently, $A \notin ad_H(B)$ after line 9. □

The next lemma proves that the rules R1-R4 
are sound, i.e. if the antecedent holds in G, then 
so does the consequent. 

Lemma 3. The rules R1-R4 are sound. 

Proof. According to the antecedent of R1, $G$ has a triplex $(\{A, C\}, B)$. Then, $G$ has an induced subgraph of the form $A \to B \leftarrow C$, $A \to B - C$ or $A - B \leftarrow C$. In either case, the consequent of R1 holds.

According to the antecedent of R2, (i) $G$ does not have a triplex $(\{A, C\}, B)$, (ii) $A \to B$ or $A - B$ is in $G$, (iii) $B \in ad_G(C)$, and (iv) $A \notin ad_G(C)$. Then, $B \to C$ or $B - C$ is in $G$. In either case, the consequent of R2 holds.

According to the antecedent of R3, (i) $G$ has a descending route from $A$ to $B$, and (ii) $A \in ad_G(B)$. Then, $A \to B$ or $A - B$ is in $G$, because $G$ has no semidirected cycle. In either case, the consequent of R3 holds.

To appreciate the soundness of R4, assume to the contrary that $A \leftarrow B$ is in $G$. Then, it follows from applying R3 to the antecedent of R4 that $G$ has an induced subgraph that is consistent with

[Figure: induced subgraph over A, B, C and D]

Moreover, recall from the antecedent of R4 that $A \in S_{CD}$, which implies that $G$ does not have a triplex $(\{C, D\}, A)$, which implies that $A - C$ and $A - D$ are in $G$. Thus, $G$ has an induced subgraph that is consistent with

[Figure: induced subgraph over A, B, C and D]

However, recall from the antecedent of R4 that $B \notin S_{CD}$, which implies that $G$ has a triplex $(\{C, D\}, B)$, which implies that $G$ has a semidirected cycle, which is a contradiction. Therefore, $A - B$ or $A \to B$ is in $G$. In either case, the consequent of R4 holds. □
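
To illustrate how blocks are added and then turned into directions, the following sketch applies only the local rules R1 and R2, followed by line 11 of Table 1. It deliberately omits R3 and R4, so it is not the full second phase and its output need not coincide with that of the algorithm. The function names and the encoding of a block as a pair `(a, b)`, meaning that the edge between `a` and `b` is blocked at the `a` end, are assumptions made for illustration.

```python
from itertools import combinations


def apply_r1_r2(adj, sep):
    """Apply R1 and R2 until no new block can be added.

    adj maps each node to its set of adjacent nodes in H.
    sep maps frozenset({a, c}) to the separator found for nonadjacent a, c.
    Returns the set of blocks; (a, b) means the edge a-b is blocked at a."""
    blocks = set()
    changed = True
    while changed:
        changed = False
        for b in adj:
            for a, c in combinations(sorted(adj[b]), 2):
                if c in adj[a]:
                    continue            # both rules need a and c nonadjacent
                if b not in sep[frozenset((a, c))]:
                    new = {(a, b), (c, b)}          # R1: block both far ends
                else:
                    new = set()
                    if (a, b) in blocks:            # R2 from the a side
                        new.add((b, c))
                    if (c, b) in blocks:            # R2 from the c side
                        new.add((b, a))
                if not new <= blocks:
                    blocks |= new
                    changed = True
    return blocks


def orient(adj, blocks):
    """Line 11 of Table 1: an edge blocked at exactly one end becomes an arrow
    leaving the blocked end; every other edge becomes undirected."""
    directed, undirected = set(), set()
    for a in adj:
        for b in adj[a]:
            if (a, b) in blocks and (b, a) not in blocks:
                directed.add((a, b))
            elif a < b and ((a, b) in blocks) == ((b, a) in blocks):
                undirected.add(frozenset((a, b)))
    return directed, undirected
```

For adjacencies $A - B$, $C - B$, $B - D$ with $S_{AC} = \emptyset$ and $S_{AD} = S_{CD} = \{B\}$, R1 blocks the $A$ and $C$ ends of the edges into $B$, R2 then blocks the $B$ end of the edge to $D$, and line 11 yields $A \to B \leftarrow C$ and $B \to D$.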

Lemma 4. After line 11, G and H have the 
same triplexes. Moreover, H has all the im- 
moralities in G. 

Proof. We first prove that any triplex in $H$ is in $G$. Assume to the contrary that $H$ has a triplex $(\{A, C\}, B)$ that is not in $G$. This is possible iff, when line 11 is executed, $H$ has an induced subgraph of one of the following forms:

[Figure: three induced subgraphs over A, B and C]

where $B \in S_{AC}$ by Lemma 2. The first and second forms are impossible because, otherwise, $A \circ\!\!\dashv B$ would be in $H$ by R2. The third form is impossible because, otherwise, $B \vdash\!\dashv C$ would be in $H$ by R2.

We now prove that any triplex $(\{A, C\}, B)$ in $G$ is in $H$. Let the triplex be of the form $A \to B \leftarrow C$. Then, when line 11 is executed, $A \vdash\!\!\circ B \circ\!\!\dashv C$ is in $H$ by R1, and neither $A \vdash\!\dashv B$ nor $B \vdash\!\dashv C$ is in $H$ by Lemmas 2 and 3. Then, the triplex is in $H$. Note that the triplex is an immorality in both $G$ and $H$. Likewise, let the triplex be of the form $A \to B - C$. Then, when line 11 is executed, $A \vdash\!\!\circ B \circ\!\!\dashv C$ is in $H$ by R1, and $A \vdash\!\dashv B$ is not in $H$ by Lemmas 2 and 3. Then, the triplex is in $H$. Note that the triplex is a flag in $G$ but it may be an immorality in $H$. □

Lemma 5. After line 10, $H$ does not have any induced subgraph of the form

[Figure: induced subgraph over A, B and C containing $A \vdash\!\!\circ B - C$]

Proof. Assume to the contrary that the lemma does not hold. Consider the following cases.

Case 1 Assume that $A \vdash\!\!\circ B$ is in $H$ due to R1. That is, $H$ has an induced subgraph of one of the following forms:

[Figure: cases 1.1 and 1.2, induced subgraphs over A, B, C and D containing $A \vdash\!\!\circ B - C$]

Case 1.1 If $B \notin S_{CD}$ then the edge between $B$ and $C$ has a block in $H$ by R1, else it has a block in $H$ by R2. Either case is a contradiction.

Case 1.2 If $C \notin S_{AD}$ then the edge between $A$ and $C$ has a block in $H$ by R1, else the edge between $B$ and $C$ has a block in $H$ by R4. Either case is a contradiction.

Case 2 Assume that $A \vdash\!\!\circ B$ is in $H$ due to R2. That is, $H$ has an induced subgraph of one of the following forms:

[Figure: cases 2.1-2.4, induced subgraphs over A, B, C and D containing $A \vdash\!\!\circ B - C$]

Case 2.1 If $A \notin S_{CD}$ then the edge between $A$ and $C$ has a block in $H$ by R1, else it has a block in $H$ by R2. Either case is a contradiction.

Case 2.2 Restart the proof with $D$ instead of $A$ and $A$ instead of $B$.

Case 2.3 Then, the edge between $A$ and $C$ has a block in $H$ by R3, which is a contradiction.

Case 2.4 If $C \notin S_{BD}$ then the edge between $B$ and $C$ has a block in $H$ by R1, else it has a block in $H$ by R2. Either case is a contradiction.

Case 3 Assume that $A \vdash\!\!\circ B$ is in $H$ due to R3. That is, $H$ has an induced subgraph of one of the following forms:

[Figure: cases 3.1-3.4, induced subgraphs over A, B, C and D containing $A \vdash\!\!\circ B - C$]

Case 3.1 If $B \notin S_{CD}$ then the edge between $B$ and $C$ has a block in $H$ by R1, else it has a block in $H$ by R2. Either case is a contradiction.

Case 3.2 Restart the proof with $D$ instead of $A$.

Case 3.3 Then, the edge between $B$ and $C$ has a block in $H$ by R3, which is a contradiction.

Case 3.4 Then, the edge between $A$ and $C$ has a block in $H$ by R3, which is a contradiction.

Case 4 Assume that $A \vdash\!\!\circ B$ is in $H$ due to R4. That is, $H$ has an induced subgraph of one of the following forms:

[Figure: cases 4.1-4.4, induced subgraphs over A, B, C, D and E containing $A \vdash\!\!\circ B - C$]

Cases 4.1-4.3 If $B \notin S_{CD}$ or $B \notin S_{CE}$ then the edge between $B$ and $C$ has a block in $H$ by R1, else it has a block in $H$ by R2. Either case is a contradiction.

Case 4.4 Assume that $C \in S_{DE}$. Note that $B \notin S_{DE}$ because, otherwise, R4 would not have been applied. Then, the edge between $B$ and $C$ has a block in $H$ by R4, which is a contradiction. On the other hand, assume that $C \notin S_{DE}$. Then, it follows from applying R1 that $H$ has an induced subgraph of the form

[Figure: induced subgraph over A, B, C, D and E]

Note that $A \in S_{DE}$ because, otherwise, R4 would not have been applied. Then, the edge between $A$ and $C$ has a block in $H$ by R4, which is a contradiction.

□

Lemma 6. After line 10, every cycle in $H$ that has an edge $\vdash$ also has an edge $\dashv$.

Proof. Assume to the contrary that $H$ has a cycle $\rho: V_1, \ldots, V_n = V_1$ that has an edge $\vdash$ but no edge $\dashv$. Note that every edge in $\rho$ cannot be $\vdash$ because, otherwise, every edge in $\rho$ would be $\vdash\!\dashv$ by repeated application of R3, which contradicts the assumption that $\rho$ has an edge $\vdash$. Therefore, $\rho$ has an edge $-$ or $\dashv$. Since the latter contradicts the assumption that the lemma does not hold, $\rho$ has an edge $-$. Assume that $\rho$ is of length three. Then, $\rho$ is of one of the following forms:

[Figure: three forms of the cycle over $V_1$, $V_2$ and $V_3$]

The first form is impossible by Lemma 5. The second form is impossible because, otherwise, $V_2 \dashv V_3$ would be in $H$ by R3. The third form is impossible because, otherwise, $V_1 \vdash V_3$ would be in $H$ by R3. Thus, the lemma holds for cycles of length three.

Assume that $\rho$ is of length greater than three. Recall from above that $\rho$ has an edge $-$ and no edge $\dashv$. Let $V_{i+1} - V_{i+2}$ be the first edge $-$ in $\rho$. Assume without loss of generality that $i > 0$. Then, $\rho$ has a subpath of the form $V_i \vdash\!\!\circ V_{i+1} - V_{i+2}$. Note that $V_i \in ad_H(V_{i+2})$ because, otherwise, if $V_{i+1} \notin S_{V_i V_{i+2}}$ then $V_{i+1} \dashv V_{i+2}$ would be in $H$ by R1, else $V_{i+1} \vdash V_{i+2}$ would be in $H$ by R2. Thus, $H$ has an induced subgraph of one of the following forms:

[Figure: three induced subgraphs over $V_i$, $V_{i+1}$ and $V_{i+2}$ containing $V_i \vdash\!\!\circ V_{i+1} - V_{i+2}$]

The first form is impossible by Lemma 5. The second form is impossible because, otherwise, $V_{i+1} \dashv V_{i+2}$ would be in $H$ by R3. Thus, the third form is the only possible. Note that this implies that $\varrho: V_1, \ldots, V_i, V_{i+2}, \ldots, V_n = V_1$ is a cycle in $H$ that has an edge $\vdash$ and no edge $\dashv$.

By repeatedly applying the reasoning above, one can see that $H$ has a cycle of length three that has an edge $\vdash$ and no edge $\dashv$. As shown above, this is impossible. Thus, the lemma holds for cycles of length greater than three too. □

Theorem 1. After line 11, H is triplex equiv- 
alent to G and it has no semidirected cycle. 

Proof. Lemma 2 implies that $G$ and $H$ have the same adjacencies. Lemma 4 implies that $G$ and $H$ have the same triplexes. Lemma 6 implies that $H$ has no semidirected cycle. □
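
The absence of semidirected cycles claimed in Theorem 1 can be checked on any hybrid graph with a simple reachability search. The sketch below is illustrative only; the function name `has_semidirected_cycle` and its edge encoding (pairs `(a, b)` for $a \to b$ and for $a - b$) are assumptions. It uses the fact that a semidirected cycle exists iff, for some directed edge $a \to b$, there is a descending route from $b$ back to $a$.

```python
from collections import deque


def has_semidirected_cycle(directed, undirected):
    """Return True iff the hybrid graph has a semidirected cycle.

    directed: iterable of pairs (a, b) meaning a -> b.
    undirected: iterable of pairs (a, b) meaning a - b."""
    directed = list(directed)
    # succ[v] holds the nodes one descending step away from v:
    # children of v and undirected neighbors of v.
    succ = {}
    for a, b in directed:
        succ.setdefault(a, set()).add(b)
    for a, b in undirected:
        succ.setdefault(a, set()).add(b)
        succ.setdefault(b, set()).add(a)
    for a, b in directed:
        # Breadth-first search along descending steps, starting at b.
        seen, queue = {b}, deque([b])
        while queue:
            v = queue.popleft()
            if v == a:
                return True
            for w in succ.get(v, ()):
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
    return False
```

For example, the graph with $A \to B$, $B - C$ and $C - A$ contains the semidirected cycle $A, B, C, A$ and is therefore not a CG, whereas the flag $A \to B - C$ alone is.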

5 Discussion 

In this paper, we have presented an algorithm for learning an AMP CG a given probability distribution $p$ is faithful to. In practice, of course, we do not usually have access to $p$ but to a finite sample from it. Our algorithm can easily be modified to deal with this situation: Replace $A \perp_p B \mid S$ in line 6 with a hypothesis test, preferably with one that is consistent so that the resulting algorithm is asymptotically correct.
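
As one possible instance of such a hypothesis test, the sketch below uses Fisher's z-test on partial correlations. This choice is an assumption made for illustration: the paper does not prescribe a particular test, and this one is appropriate only if the sample is assumed to come from a Gaussian distribution. The function names and the data layout (a dictionary from variable name to a list of observations) are hypothetical.

```python
import math
from statistics import NormalDist


def pearson(x, y):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(data, a, b, cond):
    """Partial correlation of a and b given cond, via the standard recursion
    that removes one conditioning variable at a time."""
    cond = tuple(cond)
    if not cond:
        return pearson(data[a], data[b])
    c, rest = cond[0], cond[1:]
    r_ab = partial_corr(data, a, b, rest)
    r_ac = partial_corr(data, a, c, rest)
    r_bc = partial_corr(data, b, c, rest)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


def fisher_z_pvalue(data, a, b, cond=()):
    """Two-sided p-value for the null hypothesis of zero partial correlation.
    Requires more than len(cond) + 3 observations."""
    cond = tuple(cond)
    n = len(data[a])
    r = partial_corr(data, a, b, cond)
    r = max(min(r, 0.999999), -0.999999)     # keep the log finite
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - len(cond) - 3) * abs(z)
    return 2 * (1 - NormalDist().cdf(stat))
```

Line 6 would then declare $A \perp_p B \mid S$ whenever `fisher_z_pvalue(data, A, B, S)` exceeds a chosen significance level. The level is a tuning parameter that the paper does not discuss.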

It is worth mentioning that, whereas R1, R2 and R4 only involve three or four nodes, R3 may involve many more. Hence, it would be desirable to replace R3 with a simpler rule such as

[Figure: a simpler rule over A, B and C]

Unfortunately, we have not succeeded so far in proving the correctness of our algorithm with such a simpler rule. Note that the output of our algorithm will be the same whether we keep R3 or we replace it with a simpler sound rule. The only benefit of the simpler rule may be a decrease in running time.



We have shown in Lemma 4 that, after line 11, $H$ has all the immoralities in $G$ or, in other words, every flag in $H$ is in $G$. The following lemma strengthens this fact.

Lemma 7. After line 11, every flag in $H$ is in every CG $F$ that is triplex equivalent to $G$.

Proof. Note that every flag in $H$ is due to an induced subgraph of the form $A \vdash B \vdash\!\dashv C$. Note also that all the blocks in $H$ follow from the adjacencies and triplexes in $G$ by repeated application of R1-R4. Since $G$ and $F$ have the same adjacencies and triplexes, all the blocks in $H$ hold in both $G$ and $F$ by Lemma 3. □



The lemma above implies that, in terms of Roverato and Studený (2006), our algorithm outputs a deflagged graph. Roverato and Studený (2006) also introduce the concept of strongly equivalent CGs: Two CGs are strongly equivalent iff they have the same adjacencies, immoralities and flags. Unfortunately, not every edge $\to$ in $H$ after line 11 is in every deflagged graph that is triplex equivalent to $G$, as the following example illustrates, where both $G$ and $H$ are deflagged graphs.

[Figure: the CGs G and H over the nodes A, B, C, D and E]

Therefore, in terms of Roverato and Studený (2006), our algorithm outputs a deflagged graph but not the largest deflagged graph. The latter is a distinguished member of a class of triplex equivalent CGs. Fortunately, the largest deflagged graph can easily be obtained from any deflagged graph in the class (Roverato and Studený, 2006, Corollary 17).

The correctness of our algorithm lies upon the assumption that $p$ is faithful to some CG. This is a strong requirement that we would like to weaken, e.g. by replacing it with the milder assumption that $p$ satisfies the composition property. Correct algorithms for learning directed and acyclic graphs (a.k.a. Bayesian networks) under the composition property assumption exist (Chickering and Meek, 2002; Nielsen et al., 2003). We have recently developed a correct algorithm for learning LWF CGs under the composition property (Peña et al., 2012). The way in which these algorithms proceed (a.k.a. score+search based approach) is rather different from that of the algorithm presented in this paper (a.k.a. constraint based approach). In a nutshell, they can be seen as consisting of two phases: A first phase that starts from the empty graph $H$ and adds single edges to it until $p$ is Markovian wrt $H$, and a second phase that removes single edges from $H$ until $p$ is Markovian wrt $H$ and $p$ is not Markovian wrt any CG $F$ st $I(H) \subseteq I(F)$. The success of the first phase is guaranteed by the composition property assumption, whereas the success of the second phase is guaranteed by the so-called Meek's conjecture (Meek, 1997). Specifically, given two directed and acyclic graphs $F$ and $H$ st $I(H) \subseteq I(F)$, Meek's conjecture states that we can transform $F$ into $H$ by a sequence of operations st, after each operation, $F$ is a directed and acyclic graph and $I(H) \subseteq I(F)$. The operations consist in adding a single edge to $F$, or replacing $F$ with a triplex equivalent directed and acyclic graph. Meek's conjecture was proven to be true in (Chickering, 2002, Theorem 4). The extension of Meek's conjecture to LWF CGs was proven to be true in (Peña, 2011, Theorem 1). Unfortunately, the extension of Meek's conjecture to AMP CGs does not hold, as the following example illustrates.

[Figure: the CGs F and H over the nodes A, B, C, D and E]

Then, $I(H) = \{X \perp_H Y \mid Z : X \perp_H Y \mid Z \in I_1(H) \cup I_2(H) \vee Y \perp_H X \mid Z \in I_1(H) \cup I_2(H)\}$ where $I_1(H) = \{A \perp_H Y \mid Z : Y, Z \subseteq B \cup C \cup E \wedge D \notin Z\}$ and $I_2(H) = \{C \perp_H Y \mid Z : Y \subseteq B \cup E \wedge A \cup D \subseteq Z\}$. One can easily confirm that $I(H) \subseteq I(F)$ by using the definition of separation. However, there is no CG that is triplex equivalent to $F$ or $H$ and, obviously, one cannot transform $F$ into $H$ by adding a single edge.



While the example above compromises the 
development of score+search learning algo- 
rithms that are correct and efficient under the 
composition property assumption, it is not clear 
to us whether it also does it for constraint based 
algorithms. This is something we plan to study. 